The computer programming languages C and Object Pascal have similar times of origin, influences, and purposes. Both were used to design (and compile) their own compilers early in their lifetimes.
Both C and Pascal are old programming languages: The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972. While C didn't change much in time, Pascal has evolved a lot and nowadays the vast majority of Pascal programming is done in modern Object Pascal, not in the old procedural Pascal. The old procedural Pascal today is essentially limited to microcontroller programming with tools such as mikroPascal, while Object Pascal is the main dialect and is used with tools such as Delphi, Lazarus (IDE) and Free Pascal.
What is documented here is the modern Object Pascal used in Free Pascal and Delphi. The C documented is C99, as standardized in 1999. These are the versions of the two languages in current use. There is no reason to compare C with an old version of Pascal that is no longer in use; the fair comparison is with Pascal as used today, in the Object Pascal dialect.
Syntactically, Object Pascal is much more Algol-like than C. English keywords are retained where C uses punctuation symbols: Pascal has and, or, and mod where C uses &&, ||, and %, for example. However, C is actually more Algol-like than Pascal regarding (simple) declarations, retaining the type-name variable-name syntax. For example, C can accept declarations at the start of any block, not just the outer block of a function.
Another, more subtle, difference is the role of the semicolon. In Pascal semicolons separate individual statements within a compound statement whereas they terminate the statement in C. They are also syntactically part of the statement itself in C (transforming an expression into a statement). This difference manifests itself primarily in two situations:
In Pascal, a semicolon can never appear directly before else, whereas in C it is mandatory there (unless a block statement is used).
In Pascal, the last statement before end is not required to be followed by a semicolon.
A superfluous semicolon can be put on the last line before end, thereby formally inserting an empty statement.
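The C side of the semicolon rule can be sketched as follows. The function names `sign` and `sign_with_blocks` are hypothetical and chosen only for illustration.

```c
/* Returns -1 for negative input and 1 otherwise. */
int sign(int x)
{
    int result;
    if (x < 0)
        result = -1;    /* the semicolon terminates this statement and is required before else */
    else
        result = 1;
    return result;
}

/* With block statements, no semicolon appears between "}" and else. */
int sign_with_blocks(int x)
{
    int result;
    if (x < 0) {
        result = -1;
    } else {
        result = 1;
    }
    return result;
}
```

In Pascal, the equivalent of the first function would have to omit the semicolon after the assignment that precedes else.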
In traditional C, there are only /* block comments */. C99, the version compared here, also accepts // line comments.
In Object Pascal, there are { block comments }, (* block comments *), and // line comments.
C and Pascal differ in their interpretation of upper and lower case. C is case sensitive while Pascal is not, thus MyLabel and mylabel are distinct names in C but identical in Pascal. In both languages, identifiers consist of letters and digits, with the rule that the first character may not be a digit. In C, the underscore counts as a letter, so even _abc is a valid name. Names with a leading underscore are often used to differentiate special system identifiers in C. Pascal also accepts the underscore as part of an identifier, so there is no difference from C in this respect.
Both C and Pascal use keywords (words reserved for use by the language itself). Examples are if, while, const, for and goto, which are keywords that happen to be common to both languages. In C, the basic built-in type names are also keywords (e.g. int, char) or combinations of keywords (e.g. unsigned char), while in Pascal the built-in type names are predefined normal identifiers.
Recent Object Pascal compilers, however, allow keywords to be escaped with &. This feature is mainly needed when communicating directly with foreign OOP systems such as COM and Cocoa, which might use field and method names that coincide with Pascal keywords. C has no way to escape keywords.
In Pascal, procedure definitions start with the keywords procedure or function, and type definitions with type. In C, function definitions are determined by syntactical context while type definitions use the keyword typedef. Both languages use a mix of keywords and punctuation for definitions of complex types; for instance, arrays are defined by the keyword array in Pascal and by punctuation in C, while enumerations are defined by the keyword enum in C but by punctuation in Pascal.
In Pascal functions, begin and end delimit a block of statements (proper), while C functions use "{" and "}" to delimit a block of statements optionally preceded by declarations. C (prior to C99) strictly requires that any declarations occur before the statements within a particular block, but allows blocks to appear within blocks, which is a way to work around this. Pascal is strict that declarations must occur before statements, but allows definitions of types and functions, not only variable declarations, to be nested inside function definitions to any level of depth.
The grammars of both languages are of a similar size. From an implementation perspective the main difference between the two languages is that to parse C it is necessary to have access to a symbol table for types, while in Pascal there is only one such construct, assignment. For instance, the C fragment X * Y; could be a declaration of Y to be an object whose type is pointer to X, or a statement-expression that multiplies X and Y. The corresponding Pascal fragment var Y:^X; is unambiguous without a symbol table.
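To make the ambiguity concrete, the sketch below uses the same tokens, X * Y;, in both roles. The function names are hypothetical; the only difference between the two cases is whether X names a variable or a type at that point in the file. A compiler may warn that the first statement has no effect, which is expected here.

```c
/* Case 1: X and Y are variables, so "X * Y;" is an expression statement. */
int multiply_case(void)
{
    int X = 6, Y = 7;
    X * Y;              /* legal: the product is computed and discarded */
    return X * Y;
}

/* Case 2: from here on X is a type name, so "X * Y;" is a declaration. */
typedef int X;

int declaration_case(void)
{
    X value = 42;
    X * Y;              /* declares Y with type "pointer to X" */
    Y = &value;
    return *Y;
}
```

The parser can only tell the two cases apart by looking X up in its table of known type names.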
Pascal requires all variable and function declarations to specify their type explicitly. In traditional C, a type name may be omitted in most contexts and the default type int (which corresponds to integer in Pascal) is then implicitly assumed (however, such defaults are considered bad practice in C and are often flagged by warnings).
C accommodates different sizes and signed and unsigned modes for integers by using modifiers such as long, short, signed, unsigned, etc. The exact meaning of the resulting integer type is machine-dependent. What can be guaranteed, however, is that long int is no shorter than int and int is no shorter than short int.
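A small sketch of how this ordering can be inspected on a given machine. The function names are hypothetical, and the sizes printed depend on the implementation.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 if short int <= int <= long int in storage size, 0 otherwise. */
int integer_sizes_are_ordered(void)
{
    return sizeof(short int) <= sizeof(int)
        && sizeof(int) <= sizeof(long int);
}

/* Prints the sizes and limits chosen by this implementation. */
void print_integer_sizes(void)
{
    printf("short: %zu bytes, int: %zu bytes, long: %zu bytes\n",
           sizeof(short int), sizeof(int), sizeof(long int));
    printf("INT_MAX = %d, LONG_MAX = %ld\n", INT_MAX, LONG_MAX);
}
```

Code that needs an exact width can use the C99 header `<stdint.h>` (for example `int32_t`) instead of relying on these modifiers.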
In Pascal, a similar end is achieved by declaring a subrange of integer (a compiler may then choose to allocate a smaller amount of storage for the declared variable):
type
    a = 1..100;
    b = -20..20;
    c = 0..100000;
This subrange feature is not supported by C.
A major, if subtle, difference between C and Pascal is how they promote integer operations. In Pascal, all operations on integers or integer subranges have the same effect, as if all of the operands were promoted to a full integer. In C, there are defined rules as to how to promote different types of integers, typically with the resultant type of an operation between two integers having a precision that is greater than or equal to the precisions of the operands. This can make machine code generated from C efficient on many processors. A highly optimizing Pascal compiler can reduce, but not eliminate, this effect under standard Pascal rules.
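The C promotion rule can be illustrated with a short sketch. The function names are hypothetical, and the value 44 assumes the common case of an 8-bit char (CHAR_BIT == 8).

```c
/* Both operands are promoted to int before the addition,
   so the sum does not wrap at 8 bits. */
int promoted_sum(void)
{
    unsigned char a = 200;
    unsigned char b = 100;
    return a + b;                       /* 300 */
}

/* Converting the int result back to unsigned char reduces it
   modulo 256 when char has 8 bits (an assumption): 300 - 256 = 44. */
unsigned char truncated_sum(void)
{
    unsigned char a = 200;
    unsigned char b = 100;
    return (unsigned char)(a + b);      /* 44 */
}
```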
The (only) pre-Standard implementation of C as well as Small-C et al. allowed integer and pointer types to be relatively freely intermixed.
In C the character type is char, which is a kind of integer that is no longer than short int. Expressions such as 'x'+1 are therefore perfectly legal, as are declarations such as int i='i'; and char c=74;.
This integer nature of char (an eight-bit byte on most machines) is clearly illustrated by declarations such as
    unsigned char uc = 255;  /* common limit */
    signed char sc = -128;   /* common negative limit */
Whether the char type should be regarded as signed or unsigned by default is up to the implementation.
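A program can find out which choice its implementation made by looking at CHAR_MIN from `<limits.h>`. The sketch below also shows arithmetic on a character value; the function names are hypothetical.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, 0 if unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Character constants are integers, so ordinary arithmetic applies. */
int next_character(int c)
{
    return c + 1;
}
```

On an ASCII system, `next_character('x')` yields the code of 'y'.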
In Pascal, characters and integers are distinct types. The built-in compiler functions ord() and chr() can be used to convert a single character to the corresponding integer value in the character set in use, and vice versa. For example, on systems using the ASCII character set, ord('1') = 49 and chr(9) is a TAB character.
In addition to the Char type, Object Pascal also has WideChar to represent Unicode characters. In C, this is usually implemented as a macro or typedef with the name wchar_t, which is an alias for an implementation-defined integer type.
In Pascal, boolean is an enumerated type. The possible values of boolean are false and true, with false=0 and true=1; other values are undefined. For conversion to integer, ord is used:
i := ord(b);
There is no standard function for converting an integer to boolean; however, the conversion is simple in practice:
b := boolean(i); // Will raise proper rangecheck errors for undefined values with range checks on.
C has binary valued relational operators (<, >, ==, !=, <=, >=) which may be regarded as boolean in the sense that they always give results which are either zero or one. As all tests (&&, ||, ?:, if, while, etc.) are performed by zero-checks, false is represented by zero, while true is represented by any other value.
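The zero-check convention can be sketched as follows; the function names are hypothetical.

```c
/* A relational operator always yields exactly 0 or 1. */
int relational_result(int a, int b)
{
    return a < b;
}

/* Any non-zero value is treated as true by a test. */
int is_truthy(int value)
{
    return value ? 1 : 0;
}

/* Applying "!" twice normalizes any truth value to 0 or 1. */
int normalize(int value)
{
    return !!value;
}
```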
To interface with COM, Object Pascal has added the ByteBool, WordBool and LongBool types, whose sizes match their prefixes and which follow the C truth table.
Free Pascal has added proper Pascal boolean types with a size suffix (boolean8, boolean16, boolean32, boolean64) to interface with GLIB, which uses gboolean, a 32-bit boolean type with the Pascal truth table.
The C programmer may sometimes use bitwise operators to perform boolean operations. Care needs to be taken because the semantics are different when operands make use of more than one bit to represent a value.
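A minimal sketch of where the two kinds of operator disagree; the function names are hypothetical. With the operands 1 and 2, both values are "true", yet they share no set bit.

```c
/* Logical AND: 1 if both operands are non-zero. */
int logical_and(int a, int b)
{
    return a && b;
}

/* Bitwise AND: combines the operands bit by bit. */
int bitwise_and(int a, int b)
{
    return a & b;
}
```

Here `logical_and(1, 2)` is 1, but `bitwise_and(1, 2)` is 0, so the bitwise form only behaves like a boolean operation when both operands are already restricted to 0 and 1.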
Pascal has another, more abstract, high-level method of dealing with bitwise data: sets. Sets allow the programmer to set, clear, intersect, and unite bitwise data values, rather than using direct bitwise operators. Example:
Pascal:
    Status := Status + [StickyFlag]; // or Include(Status,StickyFlag);
    Status := Status - [StickyFlag]; // or Exclude(Status,StickyFlag);
    if (StickyFlag in Status) then ...
C:
    Status |= StickyFlag;
    Status &= ~StickyFlag;
    if (Status & StickyFlag) { ...
Although bit operations on integers and operations on sets can be considered similar if the sets are implemented using bits, there is no direct parallel between their uses unless a non-standard conversion between integers and sets is possible.
Pascal can also perform bitwise operations exactly the same way as C through the and, or, not and xor operators. These operators normally work on booleans, but when the operands are integers, they behave as bitwise operators. This is made possible by boolean and integer being distinct, incompatible types. Therefore, the C code above could be written in Pascal as:
    Status := Status or StickyFlag;
    Status := Status and not StickyFlag;
    if Status and StickyFlag <> 0 then ...
In C, a string remains a pointer to the first element of a null-terminated array of char, as it was in 1972. One still has to use library support from <string.h> to manipulate strings.
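As an illustration of that library support, the sketch below concatenates and compares strings with strlen, strcpy, strcat and strcmp. The helper names `join_strings` and `strings_equal` are hypothetical; the caller supplies the destination buffer and its capacity, since a C string does not manage its own storage.

```c
#include <string.h>

/* Writes first followed by second into dest, which has room for
   capacity bytes. Returns 0 on success, -1 if the result does not fit. */
int join_strings(char *dest, size_t capacity,
                 const char *first, const char *second)
{
    size_t needed = strlen(first) + strlen(second) + 1; /* +1 for the terminator */
    if (needed > capacity)
        return -1;
    strcpy(dest, first);
    strcat(dest, second);
    return 0;
}

/* Returns 1 if both strings have the same contents (case sensitive). */
int strings_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Comparing two strings with == would compare the pointers, not the characters, which is why strcmp is needed.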
Object Pascal has many string types because when a new type is introduced, the old one is kept for backwards compatibility. This happened twice: with Delphi 2 (introduction of AnsiString) and with Delphi 2009 (UnicodeString). Besides the main string types (ShortString, AnsiString, WideString, UnicodeString) and the corresponding character types (AnsiChar, WideChar = UnicodeChar), all types derived from the character type have some string properties too (pointer to char, array of char, dynamic array of char, pointer to array of char, etc.).
In Object Pascal, string is a compiler-managed type and is reference-counted where necessary; that is, its storage management is handled by the compiler (or more accurately, by the code the compiler injects into the executable). String concatenation is done with the + operator, and strings can be compared (case sensitively) with the standard relational operators: < <= = <> >= >.
Object Pascal also provides C-compatible strings under the type PAnsiChar, with manipulation routines defined in the Strings unit. Moreover, Object Pascal provides a wide variety of string types:
ShortString: internally an array [0 .. N] of Char, where index 0 holds the string length. A ShortString can hold at most 255 characters, because the length is stored in an unsigned byte, whose upper limit is 255. N is given at either type definition or variable declaration (see the example below).
AnsiString: a dynamic, unlimited-length, reference-counted version of ShortString. Since Delphi 2009 it has a field that signals the encoding of the contents.
WideString: on Windows (win32/64/ce), compatible with COM BSTR, encoded in UCS-2/UTF-16 and reference-counted by COM. On non-Windows platforms it is equal to UnicodeString.
UnicodeString: like WideString, but encoded in UTF-16.
For convenience, a plain String type is provided which, depending on a compiler switch, can mean ShortString, AnsiString or even UnicodeString. As an additional convention, if a character limit is given it is a ShortString; otherwise it is one of the others.
ShortString and AnsiString values can be freely intermixed when manipulating strings; the compiler converts silently when required.
Example:
type
    TString80 = String[80];
var
    ss  : ShortString;
    s80 : String[80];   // declare a (short) string of maximum length 80
    s80t: TString80;    // same as above
    ans : AnsiString;   // not named "as", which is a reserved word in Object Pascal
    s   : String;       // could mean String[255], AnsiString or UnicodeString
begin
    ss := ans + s80;    // YES, this is possible and conversion is done transparently by the compiler
end;
In C, arrays are only a thin construct for declaring storage for multiple variables of the same type. Arrays in C don't know their own length at run time, and in most expressions they are referred to through a pointer to the first element, which is why they're always 0-based. Example:
    // declare int "array" named a of length 10, with all elements set to 0
    int a[10] = {0};
    // print the first element, or more precisely the element at the address held by a + 0
    printf("%d",a[0]);
    // print the second element, or more precisely the element at the address held by a + 1
    printf("%d",a[1]);
    // pass the array to a function, or more precisely pass the pointer to the first element
    somefunction(a);
    // same as above
    somefunction(&a[0]);
To get the array length, one has to calculate sizeof(<array_variable>) / sizeof(<base_type>). Therefore, to count the length of an integer array, use: sizeof(intarr) / sizeof(int). It's a common mistake to calculate this in a function expecting an array as an argument. Despite appearances, functions can only accept a pointer as an argument, not the real array. Therefore, inside the function, the array is treated as a plain pointer. Example:
    // This function does NOT accept an array, but a pointer to int
    // Semantically, it's the same as: int *a
    void func(int a[]) {
        // WRONG! Would return sizeof(pointer) / sizeof(int)
        int len = sizeof(a) / sizeof(int);
    }
    int main() {
        int a[5];
        // correct, would return 5
        int len = sizeof(a) / sizeof(int);
        func(a);
        return 0;
    }
A common solution to the problem above is to always pass the array length as a function argument; functions that expect an array argument should therefore also provide a parameter for its length.
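Passing the length alongside the pointer might look like the sketch below. The function `sum_ints` and the macro `ARRAY_LEN` are hypothetical names. The macro is only valid where the real array is in scope, not on a pointer parameter.

```c
#include <stddef.h>

/* Element count of a real array; must not be applied to a pointer. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* The function receives a pointer, so the caller supplies the count. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

A caller that owns `int data[5]` would write `sum_ints(data, ARRAY_LEN(data))`.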
Although arrays are treated as pointers, not every pointer-style construct works on them. For example, this code compiles fine but does not do what it appears to: the assignment inside func changes only the function's local copy of the pointer, so the array in main is untouched and the allocated memory is leaked:
    #include <stdlib.h>

    void func(int *a) {
        // BUG! a is a local copy of the pointer; the caller's array has fixed storage and cannot be reallocated
        a = (int*) malloc(sizeof(int) * 10);
    }
    int main() {
        int a[5];
        func(a);
    }
Care should be taken when designing such code, and the documentation should state this explicitly to keep users from making this mistake.
Assignment between static arrays isn't allowed, and one must use the memcpy function and its variants to copy data between arrays.
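A minimal sketch of such a copy, wrapped in a hypothetical helper `copy_ints`. The byte count is the element count multiplied by the element size, and memcpy requires that the two regions do not overlap (memmove is the variant for overlapping regions).

```c
#include <string.h>

/* Copies count ints from src to dest; the regions must not overlap. */
void copy_ints(int *dest, const int *src, size_t count)
{
    memcpy(dest, src, count * sizeof *src);
}
```

As with Move in Pascal, it is the caller's responsibility to make sure both arrays hold at least count elements.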
In Pascal, arrays are declared using the array keyword, specifying the lower and upper bound and the base type. For example:
type
    T10IntegerArray = array [1 .. 10] of Integer;
    TNegativeLowerBoundArray = array [-5 .. 5] of Integer;
var
    IntegerArray: T10IntegerArray;
    NegArray: TNegativeLowerBoundArray;
Arrays know their upper and lower bound (and implicitly their length), and this information is passed along when a function expects an array as an argument. The functions Low(), High() and Length() retrieve the lower bound, upper bound and array length, respectively, in any context.
Without an explicit cast, arrays cannot be converted to pointers; attempting to do so is a compile-time error. This is a property of type-safe programming.
Assignment between static arrays is allowed. The assignment copies all items from the source array to the destination. The upper and lower bounds of source and destination must be compatible. If they differ, one can use Move to copy data partially. However, since Move is a low-level function, it must be used with care. It's the programmer's responsibility to ensure that the data movement exceeds neither the destination nor the source boundary. Example:
type
    TArray1 = array [1 .. 10] of Integer;
    TArray2 = array [1 .. 5] of Integer;
var
    a,b: TArray1;
    c: TArray2;
begin
    a := b; // OK
    // Copy all elements from c to a, overwriting the elements of a from its 1st index up to 1st index + Length(c) - 1
    Move(c,a,Length(c) * SizeOf(Integer));
    // Copy all elements from c to a, starting at index 5 of a
    Move(c,a[5],Length(c) * SizeOf(Integer));
    // Copy first 5 elements from b to c
    Move(b,c,5 * SizeOf(Integer));
end.
C has no language support for declaring and using dynamic arrays. However, thanks to its pointer dereference syntax, a dynamic array can be implemented with memory management functions, usually those from <stdlib.h>. Example:
    int size = 10;
    int *a = (int*) malloc(sizeof(int) * size); // allocate dynamic array of integer with size 10
    int i;
    for (i = 0; i < size; i++)
        ... // do something with a[i]
    size *= 2;
    int *temp = realloc(a,sizeof(int) * size); // double the space, retaining the existing elements
    if (temp == NULL)
        error("Not enough memory!");
    a = temp;
    ... // do something with a
    free(a); // free the storage
As can be seen, the length again isn't maintained automatically, and reallocation should go through an additional variable so that the original pointer is not lost if there is not enough memory. Assignment between dynamic arrays follows the pointer assignment rule.
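One way to keep the length together with the storage is to bundle both in a struct. The sketch below is only an illustration: the type `IntArray` and the functions `int_array_resize` and `int_array_free` are hypothetical names, not part of any standard library.

```c
#include <stdlib.h>

/* A pointer bundled with the element count it refers to. */
typedef struct {
    int   *items;
    size_t length;
} IntArray;

/* Resizes the array, keeping existing elements.
   Returns 0 on success; on failure returns -1 and leaves the array unchanged. */
int int_array_resize(IntArray *array, size_t new_length)
{
    if (new_length == 0) {
        free(array->items);
        array->items = NULL;
        array->length = 0;
        return 0;
    }
    /* Use a temporary so the old pointer survives a failed realloc. */
    int *temp = realloc(array->items, new_length * sizeof(int));
    if (temp == NULL)
        return -1;
    array->items = temp;
    array->length = new_length;
    return 0;
}

/* Releases the storage and resets the array to empty. */
void int_array_free(IntArray *array)
{
    free(array->items);
    array->items = NULL;
    array->length = 0;
}
```

An `IntArray` must start out as `{NULL, 0}`, since realloc with a null pointer behaves like malloc. Assigning one `IntArray` to another still copies only the pointer and the length, not the elements.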
Object Pascal provides language-level support for dynamic arrays. A dynamic array is declared with the lower and upper bound omitted. One must then call the SetLength() function to allocate the storage. Dynamic arrays in Object Pascal are reference counted, so one doesn't have to worry about freeing the storage. Dynamic arrays are always zero-based. The three functions Low(), High() and Length() still retrieve the lower bound, upper bound and array length correctly. Example:
type
    TIntArray = array of Integer;
    T2DimIntArray = array of array of Integer;
var
    a : TIntArray;
    a2 : T2DimIntArray;
    i,j: Integer;
begin
    SetLength(a,10); // allocate storage for 10 elements
    for i := Low(a) to High(a) do
        ... // do something with a[i]
    SetLength(a2,10,10); // allocate storage for 10 x 10 elements
    for i := Low(a2) to High(a2) do
        for j := Low(a2[i]) to High(a2[i]) do
            ... // do something with a2[i,j]
end;
Assignment between dynamic arrays copies the reference of the source array to the destination. If a real copy is required, one can use the Copy function. Example:
type
    TIntegerArray = array of Integer;
var
    a,b: TIntegerArray;
begin
    ... // initialize a and b
    a := b; // a now points to the same array pointed to by b
    a[1] := 0; // b[1] should be 0 as well after this
    a := Copy(b,3,5); // Copy 5 elements from b starting from index 3
                      // a would access them from 0 to 4 however
end.